Online Non-stationary Boosting
Authors
Abstract
Oza’s Online Boosting algorithm provides a version of AdaBoost that can be trained in an online manner on stationary problems. One perspective is that this brings the power of the boosting framework to datasets too large to fit into memory. However, the online boosting algorithm assumes that examples are independent and identically distributed (i.i.d.) and therefore has no provision for concept drift. We present an algorithm called Online Non-Stationary Boosting (ONSBoost) that, like Online Boosting, uses a static ensemble size without generating new members each time new examples are presented, but which also adapts to a changing data distribution. We evaluate the new algorithm against Online Boosting on the STAGGER dataset and on three challenging datasets derived from a learning problem inside a parallelising virtual machine. We find that the new algorithm provides equivalent performance on the STAGGER dataset and an improvement of up to 3% on the parallelisation datasets.
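The baseline the paper builds on, Oza's Online Boosting, replaces AdaBoost's batch reweighting of examples with Poisson(λ) sampling: each weak learner trains on an incoming example k ~ Poisson(λ) times, and λ is scaled up or down depending on whether that learner got the example right. A minimal sketch follows, assuming a toy nearest-centroid weak learner (`CentroidLearner` is illustrative only, not a base learner from the paper):

```python
import math
import random


class CentroidLearner:
    """Toy online weak learner (hypothetical stand-in for the paper's base
    learners): classifies by the nearest per-class mean of one feature."""

    def __init__(self, feature):
        self.feature = feature
        self.sums = {0: 0.0, 1: 0.0}
        self.counts = {0: 0, 1: 0}

    def train(self, x, y):
        self.sums[y] += x[self.feature]
        self.counts[y] += 1

    def predict(self, x):
        if not (self.counts[0] and self.counts[1]):
            return 0  # untrained: default to class 0
        c0 = self.sums[0] / self.counts[0]
        c1 = self.sums[1] / self.counts[1]
        return 0 if abs(x[self.feature] - c0) <= abs(x[self.feature] - c1) else 1


class OnlineBoost:
    """Oza-style online boosting over a fixed ensemble of weak learners."""

    def __init__(self, learners, seed=0):
        self.learners = learners
        self.lam_sc = [1e-9] * len(learners)  # weight mass on correct examples
        self.lam_sw = [1e-9] * len(learners)  # weight mass on mistakes
        self.rng = random.Random(seed)

    def _poisson(self, lam):
        # Knuth's method for sampling Poisson(lam)
        threshold, k, p = math.exp(-lam), 0, 1.0
        while True:
            p *= self.rng.random()
            if p <= threshold:
                return k
            k += 1

    def train(self, x, y):
        lam = 1.0
        for m, h in enumerate(self.learners):
            # Poisson sampling approximates AdaBoost's example reweighting
            for _ in range(self._poisson(lam)):
                h.train(x, y)
            total = lambda: self.lam_sc[m] + self.lam_sw[m]
            if h.predict(x) == y:
                self.lam_sc[m] += lam
                lam *= total() / (2 * self.lam_sc[m])  # down-weight easy example
            else:
                self.lam_sw[m] += lam
                lam *= total() / (2 * self.lam_sw[m])  # up-weight hard example

    def predict(self, x):
        votes = {0: 0.0, 1: 0.0}
        for m, h in enumerate(self.learners):
            eps = self.lam_sw[m] / (self.lam_sc[m] + self.lam_sw[m])
            if eps < 0.5:  # only learners better than chance get a vote
                votes[h.predict(x)] += math.log((1 - eps) / max(eps, 1e-9))
        return max(votes, key=votes.get)


# Usage: a separable stream where the label is determined by feature 0.
rng = random.Random(1)
model = OnlineBoost([CentroidLearner(0), CentroidLearner(1), CentroidLearner(0)])
for _ in range(300):
    x = [rng.random(), rng.random()]
    model.train(x, int(x[0] > 0.5))
held_out = [[rng.random(), rng.random()] for _ in range(100)]
accuracy = sum(model.predict(x) == int(x[0] > 0.5) for x in held_out) / 100
```

Note the contrast the abstract draws: this baseline keeps a static ensemble but never replaces members, so a shift in the data distribution leaves stale learners voting; ONSBoost's contribution is adapting the fixed-size ensemble under such drift.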
Similar Papers
Parallel Online Continuous Arcing with a Mixture of Neural Networks
This paper presents a new arcing (boosting) algorithm called POCA, Parallel Online Continuous Arcing. Unlike traditional arcing algorithms (such as AdaBoost), which construct an ensemble by adding and training weak learners sequentially on a round-by-round basis, training in POCA is performed over an entire ensemble continuously and in parallel. Since members of the ensemble are not frozen after...
A Bayesian Framework for Online Classifier Ensemble
We propose a Bayesian framework for recursively estimating the classifier weights in online learning of a classifier ensemble. In contrast with past methods, such as stochastic gradient descent or online boosting, our framework estimates the weights in terms of evolving posterior distributions. For a specified class of loss functions, we show that it is possible to formulate a suitably defined ...
A Bayesian Approach for Online Classifier Ensemble
We propose a Bayesian approach for recursively estimating the classifier weights in online learning of a classifier ensemble. In contrast with past methods, such as stochastic gradient descent or online boosting, our approach estimates the weights by recursively updating its posterior distribution. For a specified class of loss functions, we show that it is possible to formulate a suitably defi...
An Online Boosting Algorithm with Theoretical Justifications
We study the task of online boosting — combining online weak learners into an online strong learner. While batch boosting has a sound theoretical foundation, online boosting deserves more study from the theoretical perspective. In this paper, we carefully compare the differences between online and batch boosting, and propose a novel and reasonable assumption for the online weak learner. Based o...
Predicting Tags for Stack Overflow Questions
Stack Overflow (https://stackoverflow.com/) is one of the major community-driven Question and Answer (Q&A) websites, focusing on topics related to computer programming. It has nearly 7 million users, who ask more than 6700 questions every day. Each question can be associated with up to five different tags, which serve as metadata to facilitate information retrieval. In this paper, we consider th...